Permutation matrix

In mathematics, particularly in matrix theory, a permutation matrix is a square binary matrix that has exactly one entry of 1 in each row and each column and 0s elsewhere. Each such matrix represents a specific permutation of m elements and, when used to multiply another matrix, can produce that permutation in the rows or columns of the other matrix.

1 Definition
2 Properties
3 Notes
4 Examples
- 4.1 Permutation of rows and columns
- 4.2 Permutation of rows
5 Solving for P
- 5.1 Example
6 Explanation
7 Matrices with constant line sums
8 See also

Definition

Given a permutation π of m elements,

$\pi : \lbrace 1, \ldots, m \rbrace \to \lbrace 1, \ldots, m \rbrace$

given in two-line form by

$\begin{pmatrix} 1 & 2 & \cdots & m \\ \pi(1) & \pi(2) & \cdots & \pi(m) \end{pmatrix},$

its permutation matrix is the m × m matrix P_π whose entries are all 0 except that in row i, the entry π(i) equals 1. We may write

$P_\pi = \begin{bmatrix} \mathbf e_{\pi(1)} \\ \mathbf e_{\pi(2)} \\ \vdots \\ \mathbf e_{\pi(m)} \end{bmatrix},$

where $\mathbf e_j$ denotes a row vector of length m with 1 in the jth position and 0 in every other position.
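The definition translates directly into code. The following is a minimal Python sketch, assuming the permutation is passed as a list `pi` in which `pi[i-1]` holds π(i); the function name `permutation_matrix` is illustrative and not taken from any library.

```python
def permutation_matrix(pi):
    """Return P_pi as a list of rows; pi[i-1] holds pi(i), with values in 1..m."""
    m = len(pi)
    if sorted(pi) != list(range(1, m + 1)):
        raise ValueError("pi must be a permutation of 1..m")
    # Row i is the unit row vector e_{pi(i)}: a single 1 in column pi(i).
    return [[1 if j == pi[i] - 1 else 0 for j in range(m)] for i in range(m)]


# Example: pi(1) = 2, pi(2) = 3, pi(3) = 1
P = permutation_matrix([2, 3, 1])
for row in P:
    print(row)
# [0, 1, 0]
# [0, 0, 1]
# [1, 0, 0]
```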

Properties

Given two permutations π and σ of m elements and the corresponding permutation matrices P_π and P_σ, their product satisfies

$P_{\sigma} P_{\pi} = P_{\pi\,\circ\,\sigma}$

This somewhat unfortunate rule is a consequence of the definitions of multiplication of permutations (composition of bijections) and of matrices, and of the choice of using the vectors $\mathbf{e}_{\pi(i)}$ as rows of the permutation matrix; if one had used columns instead then the product above would have been equal to $P_{\sigma\,\circ\,\pi}$ with the permutations in their original order.
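The reversal of order can be checked numerically on a small case. The sketch below assumes the row convention defined above and uses hypothetical helper names (`permutation_matrix`, `matmul`, `compose`); the two sample permutations are chosen only for illustration.

```python
def permutation_matrix(pi):
    """Rows are e_{pi(1)}, ..., e_{pi(m)}; pi is a 1-based list of images."""
    m = len(pi)
    return [[1 if j == pi[i] - 1 else 0 for j in range(m)] for i in range(m)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def compose(f, g):
    """Return f∘g, the permutation i -> f(g(i)), as a 1-based list."""
    return [f[g[i] - 1] for i in range(len(g))]


pi = [2, 3, 1]
sigma = [2, 1, 3]

product = matmul(permutation_matrix(sigma), permutation_matrix(pi))

# With rows as unit vectors, the product corresponds to pi∘sigma ...
print(product == permutation_matrix(compose(pi, sigma)))  # True
# ... and, for this non-commuting pair, not to sigma∘pi.
print(product == permutation_matrix(compose(sigma, pi)))  # False
```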

As permutation matrices are orthogonal matrices (i.e., $P_{\pi}P_{\pi}^{T} = I$ ), the inverse matrix exists and can be written as

$P_{\pi}^{-1} = P_{\pi^{-1}} = P_{\pi}^{T}.$
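As a quick sanity check of this identity, the following sketch builds the matrix of the inverse permutation and compares it with the transpose. The helper names are assumptions made for this illustration.

```python
def permutation_matrix(pi):
    """Rows are e_{pi(1)}, ..., e_{pi(m)}; pi is a 1-based list of images."""
    m = len(pi)
    return [[1 if j == pi[i] - 1 else 0 for j in range(m)] for i in range(m)]


def inverse(pi):
    """Return the inverse permutation as a 1-based list."""
    inv = [0] * len(pi)
    for i, image in enumerate(pi, start=1):
        inv[image - 1] = i  # pi(i) = image, so pi^{-1}(image) = i
    return inv


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


pi = [2, 3, 1]
P = permutation_matrix(pi)
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

print(permutation_matrix(inverse(pi)) == transpose(P))  # True
print(matmul(P, transpose(P)) == identity)              # True
```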

Multiplying $P_{\pi}$ times a column vector g will permute the rows of the vector:

$P_\pi \mathbf{g} = \begin{bmatrix} \mathbf{e}_{\pi(1)} \\ \mathbf{e}_{\pi(2)} \\ \vdots \\ \mathbf{e}_{\pi(n)} \end{bmatrix} \begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_n \end{bmatrix} = \begin{bmatrix} g_{\pi(1)} \\ g_{\pi(2)} \\ \vdots \\ g_{\pi(n)} \end{bmatrix}.$

Now applying $P_\sigma$ after applying $P_\pi$ gives the same result as applying $P_{\pi\circ\sigma}$ directly, in accordance with the above multiplication rule: call $P_\pi\mathbf{g} = \mathbf{g}'$ , in other words

$g'_i=g_{\pi(i)}\,$

for all i, then

$P_\sigma(P_\pi(\mathbf{g})) = P_\sigma(\mathbf{g}') = \begin{bmatrix} g'_{\sigma(1)} \\ g'_{\sigma(2)} \\ \vdots \\ g'_{\sigma(n)} \end{bmatrix} = \begin{bmatrix} g_{\pi(\sigma(1))} \\ g_{\pi(\sigma(2))} \\ \vdots \\ g_{\pi(\sigma(n))} \end{bmatrix}.$

Multiplying a row vector h times $P_{\pi}$ will permute the columns of the vector by the inverse of $P_{\pi}$ :

$\mathbf{h}P_\pi = \begin{bmatrix} h_1 \; h_2 \; \dots \; h_n \end{bmatrix} \begin{bmatrix} \mathbf{e}_{\pi(1)} \\ \mathbf{e}_{\pi(2)} \\ \vdots \\ \mathbf{e}_{\pi(n)} \end{bmatrix} = \begin{bmatrix} h_{\pi^{-1}(1)} \; h_{\pi^{-1}(2)} \; \dots \; h_{\pi^{-1}(n)} \end{bmatrix}$

Again it can be checked that $(\mathbf{h}P_\sigma)P_\pi = \mathbf{h}P_{\pi\circ\sigma}$ .
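The two actions can be compared side by side. This is a small sketch under the row convention above, with sample numeric vectors chosen for illustration: left multiplication selects entry π(i), while right multiplication selects entry π⁻¹(j).

```python
def permutation_matrix(pi):
    """Rows are e_{pi(1)}, ..., e_{pi(m)}; pi is a 1-based list of images."""
    m = len(pi)
    return [[1 if j == pi[i] - 1 else 0 for j in range(m)] for i in range(m)]


pi = [2, 3, 1]          # pi(1) = 2, pi(2) = 3, pi(3) = 1
P = permutation_matrix(pi)
g = [10, 20, 30]        # treated as a column vector (g_1, g_2, g_3)
h = [10, 20, 30]        # treated as a row vector (h_1, h_2, h_3)

# (P g)_i = sum_k P[i][k] * g[k], which picks out g_{pi(i)}
Pg = [sum(P[i][k] * g[k] for k in range(3)) for i in range(3)]

# (h P)_j = sum_k h[k] * P[k][j], which picks out h_{pi^{-1}(j)}
hP = [sum(h[k] * P[k][j] for k in range(3)) for j in range(3)]

print(Pg)  # [20, 30, 10]
print(hP)  # [30, 10, 20]
```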

Notes

Let S_n denote the symmetric group, or group of permutations, on {1,2,...,n}. Since there are n! permutations, there are n! permutation matrices. By the formulas above, the n × n permutation matrices form a group under matrix multiplication with the identity matrix as the identity element.

If (1) denotes the identity permutation, then P₍₁₎ is the identity matrix.

One can view the permutation matrix of a permutation σ as the permutation σ of the columns of the identity matrix I, or as the permutation σ⁻¹ of the rows of I.

A permutation matrix is a doubly stochastic matrix. The Birkhoff–von Neumann theorem says that every doubly stochastic matrix is a convex combination of permutation matrices of the same order and the permutation matrices are the extreme points of the set of doubly stochastic matrices. That is, the Birkhoff polytope, the set of doubly stochastic matrices, is the convex hull of the set of permutation matrices.
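The easy direction of this statement, that a convex combination of permutation matrices is doubly stochastic, can be illustrated with exact rational arithmetic. The weights and permutations below are arbitrary choices for the sketch.

```python
from fractions import Fraction


def permutation_matrix(pi):
    """Rows are e_{pi(1)}, ..., e_{pi(m)}; pi is a 1-based list of images."""
    m = len(pi)
    return [[1 if j == pi[i] - 1 else 0 for j in range(m)] for i in range(m)]


# Nonnegative weights summing to 1 (a convex combination).
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
perms = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
mats = [permutation_matrix(p) for p in perms]

D = [[sum(w * M[i][j] for w, M in zip(weights, mats)) for j in range(3)]
     for i in range(3)]

row_sums = [sum(row) for row in D]
col_sums = [sum(col) for col in zip(*D)]
print(row_sums == [1, 1, 1] and col_sums == [1, 1, 1])  # True
```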

The product PM, premultiplying a matrix M by a permutation matrix P = P_π, permutes the rows of M; with the row convention used above, row i of PM is row π(i) of M. Likewise, MP permutes the columns of M.

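The map S_n → A ⊂ GL(n, Z₂) that sends each permutation to its permutation matrix is a faithful representation, where A denotes the image of the map, that is, the set of n × n permutation matrices. Thus, |A| = n!.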

The trace of a permutation matrix is the number of fixed points of the permutation. If the permutation has fixed points, so it can be written in cycle form as π = (a₁)(a₂)...(a_k)σ where σ has no fixed points, then e_a₁,e_a₂,...,e_{a_k} are eigenvectors of the permutation matrix.

From group theory we know that any permutation may be written as a product of transpositions. Therefore, any permutation matrix P factors as a product of row-interchanging elementary matrices, each having determinant −1. Thus the determinant of a permutation matrix P is just the signature of the corresponding permutation.
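The trace and determinant statements can be checked on a concrete permutation. In the sketch below, the signature is computed from the cycle structure and the determinant by cofactor expansion, which is adequate only for small matrices; all function names are assumptions for this illustration.

```python
def permutation_matrix(pi):
    """Rows are e_{pi(1)}, ..., e_{pi(m)}; pi is a 1-based list of images."""
    m = len(pi)
    return [[1 if j == pi[i] - 1 else 0 for j in range(m)] for i in range(m)]


def sign(pi):
    """Signature from cycles: a cycle of length L contributes (-1)**(L - 1)."""
    seen = [False] * len(pi)
    s = 1
    for start in range(len(pi)):
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = pi[j] - 1
            length += 1
        if length > 0 and length % 2 == 0:
            s = -s
    return s


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j, a in enumerate(A[0]):
        if a != 0:
            minor = [row[:j] + row[j + 1:] for row in A[1:]]
            total += (-1) ** j * a * det(minor)
    return total


pi = [1, 4, 2, 5, 3]   # one fixed point (1) and the 4-cycle 2 -> 4 -> 5 -> 3 -> 2
P = permutation_matrix(pi)

trace = sum(P[i][i] for i in range(len(pi)))
fixed_points = sum(1 for i, image in enumerate(pi, start=1) if image == i)

print(trace, fixed_points)  # 1 1
print(det(P), sign(pi))     # -1 -1
```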

Examples

Permutation of rows and columns

When a permutation matrix P is multiplied with a matrix M from the left, it permutes the rows of M (for a column vector, its elements); when P is multiplied with M from the right, it permutes the columns of M (for a row vector, its elements).

The column-vector and row-vector arrangements are reflections (transposes) of one another. This follows from the rule $\left( \mathbf{A B} \right) ^\mathrm{T} = \mathbf{B}^\mathrm{T} \mathbf{A}^\mathrm{T}$ (compare: Transpose).

Permutation of rows

The permutation matrix P_π corresponding to the permutation $\pi=\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 4 & 2 & 5 & 3 \end{pmatrix}$ is

$P_\pi = \begin{bmatrix} \mathbf{e}_{\pi(1)} \\ \mathbf{e}_{\pi(2)} \\ \mathbf{e}_{\pi(3)} \\ \mathbf{e}_{\pi(4)} \\ \mathbf{e}_{\pi(5)} \end{bmatrix} = \begin{bmatrix} \mathbf{e}_{1} \\ \mathbf{e}_{4} \\ \mathbf{e}_{2} \\ \mathbf{e}_{5} \\ \mathbf{e}_{3} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}.$

Given a vector g,

$P_\pi \mathbf{g} = \begin{bmatrix} \mathbf{e}_{\pi(1)} \\ \mathbf{e}_{\pi(2)} \\ \mathbf{e}_{\pi(3)} \\ \mathbf{e}_{\pi(4)} \\ \mathbf{e}_{\pi(5)} \end{bmatrix} \begin{bmatrix} g_1 \\ g_2 \\ g_3 \\ g_4 \\ g_5 \end{bmatrix} = \begin{bmatrix} g_1 \\ g_4 \\ g_2 \\ g_5 \\ g_3 \end{bmatrix}.$

Solving for P

If we are given two matrices A and B which are known to be related as $B = P A P^{-1}$ , but the permutation matrix P itself is unknown, we can find P using eigenvalue decomposition:

$A = Q_A \Lambda Q_A^{-1}$

$B = Q_B \Lambda Q_B^{-1}$

where $\Lambda$ is a diagonal matrix of eigenvalues, and $Q_A$ and $Q_B$ are the matrices of eigenvectors. Because $A$ and $B$ are similar, their eigenvalues are always the same. Provided the eigenvalues are distinct and the eigenvectors in $Q_A$ and $Q_B$ are listed in the same order with the same normalization and sign, P can be computed as $P = Q_B Q_A^{-1}$ . In other words, $P Q_A = Q_B$ , which means that the eigenvectors of B are simply permuted eigenvectors of A.

Example

Let $A$ and $B$ be two $3 \times 3$ matrices such that

$A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & 0 & 1.5 \\ 2 & 1.5 & 0 \end{bmatrix} \text{, and}$

$B = \begin{bmatrix} 0 & 1 & 1.5 \\ 1 & 0 & 2 \\ 1.5 & 2 & 0 \end{bmatrix}.$

Let $P$ be the $3 \times 3$ matrix permuting $A$ into $B$ such that

$P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$

Multiplying $A$ by $P$ from the left permutes the rows of $A$ , whereas multiplying $A$ from the right by $P^{-1}$ (here equal to $P$ ) permutes the columns of $A$ . Therefore $P$ interchanges the first and second rows and the first and second columns of $A$ to produce $B$ (as visual inspection confirms). So $A$ and $B$ share the same eigenvalues by the discussion above. Computing these eigenvalues and placing them on the diagonal, the resultant diagonal matrix $\Lambda$ is

$\Lambda = \begin{bmatrix} -2.09394 & 0 & 0 \\ 0 & -0.9433954 & 0 \\ 0 & 0 & 3.037337 \end{bmatrix}$

and the $Q_A$ matrix of eigenvectors for $A$ is

$Q_A = \begin{bmatrix} -0.60130 & 0.54493 & 0.58437 \\ -0.25523 & -0.82404 & 0.50579 \\ 0.75716 & 0.15498 & 0.63458 \end{bmatrix}$

and the $Q_B$ matrix of eigenvectors for $B$ is

$Q_B = \begin{bmatrix} -0.25523 & -0.82404 & -0.50579 \\ -0.60130 & 0.54493 & -0.58437 \\ 0.75716 & 0.15498 & -0.63458 \end{bmatrix}.$

Comparing the first eigenvector (i.e., the first column) of both we can write the first column of $P$ by noting that the first element ( $Q_{A(1,1)} = -0.60130$ ) matches the second element ( $Q_{B(2,1)}$ ), thus we put a 1 in the second element of the first column of $P$ . Repeating this procedure, we match the second element ( $Q_{A(2,1)}$ ) to the first element ( $Q_{B(1,1)}$ ), thus we put a 1 in the first element of the second column of $P$ ; and the third element ( $Q_{A(3,1)}$ ) to the third element ( $Q_{B(3,1)}$ ), thus we put a 1 in the third element of the third column of $P$ .

The resulting $P$ matrix is:

$P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$

And comparing to the $P$ matrix from above, we find they are the same.
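The matching procedure can be sketched in code using the rounded eigenvector entries quoted above. This is an illustration under stated assumptions rather than a general method: it compares whole rows of $Q_A$ and $Q_B$ by absolute value, so that the sign flip in the third eigenvector does not matter, and it assumes each row has a unique match within the tolerance `tol`. The function name `recover_permutation` is hypothetical.

```python
A = [[0, 1, 2], [1, 0, 1.5], [2, 1.5, 0]]
B = [[0, 1, 1.5], [1, 0, 2], [1.5, 2, 0]]

# Eigenvector matrices as quoted in the text (rounded to five decimals).
Q_A = [[-0.60130, 0.54493, 0.58437],
       [-0.25523, -0.82404, 0.50579],
       [0.75716, 0.15498, 0.63458]]
Q_B = [[-0.25523, -0.82404, -0.50579],
       [-0.60130, 0.54493, -0.58437],
       [0.75716, 0.15498, -0.63458]]


def recover_permutation(Q_A, Q_B, tol=1e-4):
    """Find P with P Q_A = Q_B up to the sign of each eigenvector column."""
    n = len(Q_A)
    P = [[0] * n for _ in range(n)]
    used = set()
    for i, row_b in enumerate(Q_B):
        for k, row_a in enumerate(Q_A):
            if k in used:
                continue
            # Row i of Q_B should equal row k of Q_A, ignoring signs.
            if all(abs(abs(a) - abs(b)) <= tol for a, b in zip(row_a, row_b)):
                P[i][k] = 1
                used.add(k)
                break
        else:
            raise ValueError("no matching row found for row %d" % i)
    return P


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


P = recover_permutation(Q_A, Q_B)
P_inv = [list(col) for col in zip(*P)]  # inverse of a permutation matrix is its transpose

print(P)                                  # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
print(matmul(matmul(P, A), P_inv) == B)   # True
```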

Explanation

A permutation matrix will always be in the form

$\begin{bmatrix} \mathbf{e}_{a_1} \\ \mathbf{e}_{a_2} \\ \vdots \\ \mathbf{e}_{a_j} \\ \end{bmatrix}$

where e_{a_i} represents the a_i-th standard basis vector (as a row) for R^j, and where

$\begin{pmatrix} 1 & 2 & \ldots & j \\ a_1 & a_2 & \ldots & a_j\end{pmatrix}$

is the permutation form of the permutation matrix.

Now, in performing matrix multiplication, one essentially forms the dot product of each row of the first matrix with each column of the second. In this instance, we form the dot product of each row of the permutation matrix with the vector of elements we want to permute. That is, for example, with $\mathbf{v} = (g_1, \ldots, g_j)^T$ ,

$\mathbf{e}_{a_i} \cdot \mathbf{v} = g_{a_i}.$

So the product of the permutation matrix with the vector v above is a vector of the form (g_{a_1}, g_{a_2}, ..., g_{a_j}), and this is a permutation of v since we have said that the permutation form is

$\begin{pmatrix} 1 & 2 & \ldots & j \\ a_1 & a_2 & \ldots & a_j\end{pmatrix}.$

So, permutation matrices do indeed permute the order of elements in vectors multiplied with them.

Matrices with constant line sums

The sum of the values in each column or row in a permutation matrix adds up to exactly 1. A possible generalization of permutation matrices is nonnegative integral matrices where the values of each column and row add up to a constant number c. A matrix of this sort is known to be the sum of c permutation matrices.

For example in the following matrix M each column or row adds up to 5.

$M = \begin{bmatrix} 5 & 0 & 0 & 0 & 0 \\ 0 & 3 & 2 & 0 & 0 \\ 0 & 0 & 0 & 5 & 0 \\ 0 & 1 & 2 & 0 & 2 \\ 0 & 1 & 1 & 0 & 3 \end{bmatrix}.$

This matrix is the sum of 5 permutation matrices.
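One way to exhibit such a decomposition is to peel off one permutation matrix at a time: find a perfect matching between rows and columns through positive entries, subtract the corresponding permutation matrix, and repeat. The sketch below uses a standard augmenting-path bipartite matching; this is one possible approach chosen for illustration, the function names are hypothetical, and the decomposition it finds is generally not unique.

```python
def perfect_matching(M):
    """Return match with M[i][match[i]] > 0 and all match[i] distinct."""
    n = len(M)
    col_owner = [-1] * n  # col_owner[j] = row currently assigned to column j

    def try_row(i, visited):
        # Try to assign row i, re-routing earlier rows along an augmenting path.
        for j in range(n):
            if M[i][j] > 0 and not visited[j]:
                visited[j] = True
                if col_owner[j] == -1 or try_row(col_owner[j], visited):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            raise ValueError("no perfect matching; are the line sums constant?")
    match = [0] * n
    for j, i in enumerate(col_owner):
        match[i] = j
    return match


def decompose(M):
    """Split a nonnegative integer matrix with constant line sums c into c permutation matrices."""
    M = [row[:] for row in M]  # work on a copy
    n = len(M)
    parts = []
    while any(any(row) for row in M):
        match = perfect_matching(M)
        parts.append([[1 if j == match[i] else 0 for j in range(n)]
                      for i in range(n)])
        for i, j in enumerate(match):
            M[i][j] -= 1  # subtracting keeps all line sums equal
    return parts


M = [[5, 0, 0, 0, 0],
     [0, 3, 2, 0, 0],
     [0, 0, 0, 5, 0],
     [0, 1, 2, 0, 2],
     [0, 1, 1, 0, 3]]

parts = decompose(M)
total = [[sum(P[i][j] for P in parts) for j in range(5)] for i in range(5)]
print(len(parts))   # 5
print(total == M)   # True
```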